IMAGE CODE ESTIMATION 



CROSS-REFERENCE TO RELATED APPLICATIONS 

The following applications disclose related subject matter: Appl. Nos. 
10/...., filed .... These referenced applications have a common assignee with the 
present application. 

BACKGROUND OF THE INVENTION 

The present invention relates to digital image processing, and more 
particularly to compressive encoding of images. 

In many applications of digital image processing, such as JPEG, MPEG or 
DV, image data is compressed by successive operations of DCT (Discrete
Cosine Transform), quantization, and Huffman (variable length) encoding. DCT 
is a pre-processing operation for image compression which converts blocks of 
spatial domain information of the image into blocks of frequency domain 
information, typically for 8x8 blocks of pixels. Generally, an image tends to
have strong correlations within a neighborhood, so DCT processing
concentrates a majority of the information of the image into the low frequencies.
Quantization is an effective compression method which divides the DCT 
coefficients by an integer quantization level so as to reduce the precision 
(number of significant bits) of the DCT coefficients. The quantized DCT 
coefficients of a block are scanned from low frequency to high frequency and 
converted into a sequence of pairs of "run" and "level" parameters (run length
encoding). Using a Huffman code table defined by the statistics of an image, the
sequence of run-level pairs is finally converted to a sequence of Huffman
(variable length) codewords. The Huffman tables for differing sets of variables of
JPEG, MPEG, or DV are fixed in the specifications of these standards. 

In these image compression methods, the size of the Huffman code 
generated is strongly dependent on the quantization level. Therefore, it is 
necessary to apply a suitable quantization level to adjust an output code size to a 
target code size. One useful approach to selecting the quantization level has a
feedback loop comparing a target code size and a code size actually generated by
the encoder circuit. This approach estimates and responds to the current code
size accurately because the actual output code size is compared to the target
code size. However, outputting the actual code incurs a delay corresponding to
the speed of the encoding algorithm. As a result, convergence to the target code
size is delayed.

SUMMARY OF THE INVENTION 

The present invention provides estimation of the code size for digital
image compression by simplifying the feedback flow for the quantization process
for DCT coefficients and Huffman encoding.

This has advantages including more efficient quantization without time 
delay of actual code generation. 

BRIEF DESCRIPTION OF THE DRAWINGS 
The drawings are heuristic for clarity. 

Figures 1a-1b show functional block diagrams of preferred embodiment 
method and preferred embodiment system. 
Figure 2 illustrates partitioning a block. 
Figure 3 shows an approximation function. 
Figure 4 depicts experimental results. 

DESCRIPTION OF THE PREFERRED EMBODIMENTS



1. Overview

Preferred embodiment image processing methods include low complexity
estimates of the code size for Huffman (variable length code) encoding a block of
quantized DCT coefficients; this provides quantization level feedback information
for selection of a quantization level(s). The method uses a histogram of the
non-zero DCT coefficient magnitudes of (an area of) a block together with
normalized code-size functions (a function for each representative "level" and
depending upon an average "run") from the Huffman table. The average "run" in the
code is estimated from the number of zero and non-zero quantized coefficients in
(the area of) the block. Figure 1a is a functional block diagram for a preferred
embodiment method.

Preferred embodiment digital image systems (such as cameras) include 
preferred embodiment image processing methods. Figure 1b shows in functional 
block form a system (digital still camera) which incorporates preferred
embodiment methods as illustrated in the JPEG compression block. The 
functions of preferred embodiment systems can be performed with digital signal 
processors (DSPs) or general purpose programmable processors or application 
specific circuitry or systems on a chip such as both a DSP and RISC processor 
on the same chip with the RISC processor as controller. Further specialized 
accelerators, such as CFA color interpolation and JPEG encoding, could be 
added to a chip with a DSP and a RISC processor. Captured images could be 
stored in memory either prior to or after image pipeline processing. The image 
pipeline functions could be a stored program in an onboard or external ROM, 
flash EEPROM, or ferroelectric RAM for any programmable processors. 

2. First preferred embodiment 

Figure 1a is a functional block diagram of a first preferred embodiment
method of quantization level determination, which includes the following
steps. First prepare a histogram of the magnitudes of the DCT coefficients of an
(8x8) block; this enables easy estimation of the histogram resulting from a
quantization process and reveals the magnitude tendencies which impact the
actually generated code size. For a fixed-point processor, a definition of the
histogram bins as the ranges $2^n$ to $2^{n+1}-1$ ($n = 0, 1, 2, \ldots$) is
especially desirable for the simplicity of implementation. Note that 0
coefficients do not appear in the histogram. Also, quantization amounts to
integer division by the quantization level, so quantization levels as powers of 2
permit binary shifting for the division.
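
The binning step can be sketched as follows. This is only an illustration under stated assumptions: the function name `magnitude_histogram`, the parameter `n_bits`, and the clamping of oversized magnitudes into the top bin are hypothetical and are not taken from the specification.

```python
def magnitude_histogram(coeffs, n_bits=12):
    """Count non-zero magnitudes in bins [2**n, 2**(n+1) - 1], n = 0..n_bits-1."""
    hist = [0] * n_bits
    for c in coeffs:
        m = abs(int(c))
        if m == 0:
            continue  # zero coefficients do not appear in the histogram
        n = m.bit_length() - 1  # position of the most significant set bit
        # Assumption: magnitudes wider than n_bits are clamped into the top bin.
        hist[min(n, n_bits - 1)] += 1
    return hist


# Hypothetical coefficients of one area of a block.
area = [0, 1, -3, 4, 7, 0, 0, 130]
print(magnitude_histogram(area, n_bits=9))
```

Because the bin index is just the position of the leading one bit, no comparison against bin edges is needed, which is the kind of simplicity the power-of-two ranges are chosen for.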

Many of the specifications in standards such as JPEG and MPEG take the
DCT coefficient quantization level to depend upon the DCT frequency; that is, a 
non-flat quantization matrix. Such frequency-dependent quantization levels use 
the fact that the sensitivity of the human visual system to high spatial frequency 
components in an image is less than that to low spatial frequency components. 
In this case, partitioning the DCT coefficient block into several areas which 
correspond to spatial frequency ranges increases the accuracy of the estimation 
of generated code size. Figure 2 shows an example of this partitioning into four 
areas and is applicable to the DV standard encoding process. Note the zig-zag 
scan order of the DCT coefficients aligns to this partitioning so that the areas are 
sequentially scanned. Thus, if each DCT block is partitioned into areas indexed
by $j$, the magnitudes of the DCT coefficients can be represented by $N$ bits, and
the histogram bins are taken to be the ranges $2^n$ to $2^{n+1}-1$
($n = 0, 1, 2, \ldots, N-1$), define the following histogram variable:

$H(i,j)$: the number of coefficients in bin $2^i$ to $2^{i+1}-1$ ($i = 0, 1, 2, \ldots, N-1$) in area $j$.

The preferred embodiment methods use such a histogram with bins of the
ranges $2^n$ to $2^{n+1}-1$ for a fast approximation of the quantization process. In
particular, for the case of the quantization levels defined in a specification under
consideration confined to $2^M$ ($M = 0, 1, 2, \ldots$), the quantization level changes
are achieved by shifting values of each bin of the histogram. That is, with
quantization level $2^M$ the number of quantized coefficients, $H_q(i,j)$, within the
bin $2^i$ to $2^{i+1}-1$ ($i = 0, 1, 2, \ldots, N-1$) in area $j$ is given by:

$$H_q(i,j) = \begin{cases} H(i+M,\, j) & (0 \le i \le N-M-1) \\ 0 & (\text{others}) \end{cases}$$

Even if the quantization levels defined in the specification under consideration
are arbitrary integers, the histogram after the quantization $H_q(i,j)$ can be
represented approximately by scaling a graph of the distribution given by the
original histogram $H(i,j)$.
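
For power-of-two quantization levels the shift can be illustrated with a few lines of code. The sketch assumes quantization truncates magnitudes (a right shift by $M$ bits); the function name `quantize_histogram` is hypothetical.

```python
def quantize_histogram(hist, m):
    """Histogram after quantization by 2**m: Hq[i] = H[i + m], zero elsewhere."""
    n = len(hist)
    return [hist[i + m] if i + m < n else 0 for i in range(n)]


# Hypothetical histogram H(i, j) of one area, with N = 9 bins.
h = [1, 1, 2, 0, 0, 0, 0, 1, 0]
print(quantize_histogram(h, 2))
```

The coefficients that fall in the lowest $M$ bins of the original histogram quantize to zero, so they simply drop out of the shifted histogram.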

The variable-length code size of the block after quantization is estimated
from the given $H_q(i,j)$ by simplifying the Huffman table. Inputs for the Huffman
table are the parameters "run" and "level", so consider expressions for these
parameters using $H_q(i,j)$. Note that the sign of a coefficient typically is the
last bit of the codeword, so coefficients differing only by sign have the same
magnitude and the same codeword size. The parameter "run" is the number of zero
coefficients between significant coefficients when scanning the DCT block from the
low frequency (upper lefthand portion in Figure 2) to the high frequency (lower
righthand portion), the so-called zig-zag scanning. If the total number of
coefficients in area $j$ is denoted $A(j)$, the parameter "run" in area $j$ can be
estimated as the number of 0 coefficients divided by the number of non-zero
coefficients. Now the number of non-zero coefficients in area $j$ equals
$\sum_i H_q(i,j)$, so estimate "run" by $r'(j)$ as follows.

$$r'(j) = \frac{A(j) - \sum_{i} H_q(i,j)}{\sum_{i} H_q(i,j)}$$


As to the parameter "level", it already is implicit in the histogram bins for
the magnitudes of the DCT coefficients in area $j$. Therefore, if an average value
is utilized as a representative for the bin of the range $2^i$ to $2^{i+1}-1$, the
parameter "level" can be estimated for coefficients in the bin by $l'(i)$:

$$l'(i) = \frac{2^i + (2^{i+1} - 1)}{2}$$

This can also be simplified to the maximum value $2^{i+1}-1$ in the case that the
specification under consideration has a restrictive upper limit on the generated
code size. That is,

$$l'(i) = 2^{i+1} - 1$$

In any case, the representative value $l'(i)$ is not dependent on the area $j$.
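
The two Huffman-table inputs can be sketched as below. The function names are hypothetical, the bin midpoint is only one reading of "an average value", and returning `None` for an area with no non-zero coefficients is an assumption, since the text does not address that case.

```python
def estimate_run(area_size, hq_area):
    """Average run r'(j): zero coefficients per non-zero coefficient in one area."""
    nonzero = sum(hq_area)
    if nonzero == 0:
        return None  # assumption: leave the run undefined for an all-zero area
    return (area_size - nonzero) / nonzero


def level_average(i):
    """Midpoint of the bin [2**i, 2**(i+1) - 1] (one reading of 'average value')."""
    return (2 ** i + 2 ** (i + 1) - 1) / 2


def level_maximum(i):
    """Largest magnitude in bin i, the simplified representative."""
    return 2 ** (i + 1) - 1


hq_area = [2, 1, 1, 0]            # hypothetical quantized histogram of one area
print(estimate_run(16, hq_area))  # 12 zeros spread over 4 non-zero coefficients
print([level_average(i) for i in range(4)])
print([level_maximum(i) for i in range(4)])
```

Neither level function takes the area index, which reflects the remark that the representative value does not depend on the area.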
Now, $r'(j)$ and $l'(i)$ provide input parameters for the Huffman table. If the code
size of the entry "run" = $r$ and "level" = $l$ in the Huffman table under
consideration is denoted as $T(r, l)$, then the code size $c(j)$ for the coefficients
in area $j$ of the DCT block can be estimated as follows.

$$c(j) = \sum_{i=0}^{N-1} H_q(i,j)\, T\big(r'(j),\, l'(i)\big)$$

In this equation, $T(r, l)$ is only applied to the representative value, $l'(i)$, for
each bin $i$. Therefore, for this estimation, an original Huffman table can be
reduced to the table restricted to the representative "level"s. However, this
calculation contains an undesirable division operation from the definition of
$r'(j)$. The division can be avoided by using a normalized function $T'(x,i)$ for
each bin $i$ where $x$ is a normalized variable of the number of non-zero
coefficients applicable to all areas. As shown in the expression for $r'(j)$, it is
determined by the fixed number of coefficients in area $j$, $A(j)$, and the variable
number of non-zero coefficients in area $j$, $\sum_i H_q(i,j)$. By normalizing
(scaling) the number $A(j)$ to a convenient integer $A$ ($= a(j)A(j)$) and the number
$\sum_i H_q(i,j)$ to $\sum_i H_q(i)$ ($= a(j)\sum_i H_q(i,j)$), a function of the
Huffman table $T(r'(j), l'(i))$ can be re-defined independent of area $j$. Thus,
code size $c(j)$ can be represented as follows:

$$c(j) = \sum_{i=0}^{N-1} H_q(i,j)\, T'(x, i), \qquad x = \sum_{i} H_q(i) = a(j) \sum_{i} H_q(i,j)$$

That is, $x$ is the number of non-zero coefficients in $A(j)$ normalized to a total
of $A$ coefficients; hence, the average "run" in $A(j)$ is $r'(j) = (A - x)/x$ and so
the definition $T'(x,i) = T((A - x)/x,\, l'(i))$ essentially precomputes the division
in $r'(j)$. Figure 3 shows an example of expressions for $T'(x,i)$ for
$i = 0, 1, \ldots, 7$ which are applicable to the Huffman table in the DV
specification. In Figure 3, the number of coefficients is normalized to $A = 128$.
It is necessary to define $T'(x,i)$ only for $i = 0, 1, 2, 3, 4$ because the defined
table for "level" over 32 has the corresponding size of the Huffman code depending
only on "run".
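
One way to picture the normalized function is as a lookup table built once, offline. The sketch below is hedged: `toy_code_size` is a made-up stand-in for a code-size table $T(r, l)$ and is not the DV table, and `build_normalized_table` and `max_level` are hypothetical names. Only the normalization to $A = 128$ and the use of $(A - x)/x$ as the run come from the text.

```python
import math

A = 128  # normalized number of coefficients, as in Figure 3


def toy_code_size(run, level):
    """Hypothetical stand-in for T(run, level); NOT a table from any standard."""
    return 2 + math.ceil(math.log2(1 + run)) + math.ceil(math.log2(1 + level))


def max_level(i):
    """Representative level for bin i, using the maximum-value simplification."""
    return 2 ** (i + 1) - 1


def build_normalized_table(code_size, n_bins, a=A):
    """Precompute T'(x, i) = T((a - x) / x, l'(i)) for x = 1..a."""
    table = []
    for i in range(n_bins):
        row = [0] * (a + 1)  # row[0] is unused: x = 0 means nothing to encode
        for x in range(1, a + 1):
            row[x] = code_size((a - x) / x, max_level(i))
        table.append(row)
    return table


t_norm = build_normalized_table(toy_code_size, n_bins=5)
print(t_norm[0][128], t_norm[0][64], t_norm[2][32])
```

All divisions happen while the table is built, so the estimator itself only needs an index computation and a lookup.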

As a result, a total code size, $S$, for the DCT block can be approximated as

$$S = \sum_{j} c(j)$$

Thus, a total code size for each block of DCT coefficients can be calculated by
the summation of the code sizes of the coefficients in the areas in the DCT block
which, in turn, are estimated from the histogram for each area by simplified
operations for the quantization level and the Huffman coding ($T'(x,i)$) without
performing an actual encoding.
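
Putting the pieces together, the per-block estimate might be sketched as follows. Everything specific here is an assumption made for illustration: the toy function `t_norm` stands in for the curves of Figure 3, the four equal areas of 16 coefficients are not the partition of Figure 2, and rounding the normalized count $x$ to the nearest integer (at least 1) is a choice the text does not specify.

```python
import math

A = 128  # normalized area size


def t_norm(x, i):
    """Hypothetical T'(x, i); a toy formula, not derived from the DV table."""
    run = (A - x) / x
    level = 2 ** (i + 1) - 1
    return 2 + math.ceil(math.log2(1 + run)) + math.ceil(math.log2(1 + level))


def area_code_size(hq_area, area_size):
    """c(j): sum over bins of Hq(i, j) * T'(x, i)."""
    nonzero = sum(hq_area)
    if nonzero == 0:
        return 0  # assumption: an all-zero area contributes no bits
    x = max(1, round(A * nonzero / area_size))  # normalized non-zero count
    return sum(count * t_norm(x, i) for i, count in enumerate(hq_area))


def block_code_size(hq_areas, area_sizes):
    """S: sum of c(j) over all areas of the block."""
    return sum(area_code_size(h, a) for h, a in zip(hq_areas, area_sizes))


hq_areas = [[2, 1, 1, 0], [4, 0, 0, 0], [0, 0, 0, 0], [1, 0, 0, 0]]
area_sizes = [16, 16, 16, 16]
print(block_code_size(hq_areas, area_sizes))
```

In a fixed-point implementation the call to `t_norm` would be replaced by a table lookup, and the scaling `A / area_size` would be a per-area constant.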

As illustrated in Figure 1a, the code size estimation for a tentative 
quantization level is then used to select a quantization level for subsequent 
variable-length encoding. 
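
A selection loop over tentative quantization levels could look like the sketch below. The selection rule (take the smallest power-of-two level whose estimate fits the target) is an assumption, since the text only states that the estimate is used to select a level. `toy_estimate` is a deliberately crude placeholder for the code size estimate, and all names are hypothetical.

```python
def shift_histogram(hist, m):
    """Histogram after quantization by 2**m (bins shift down by m)."""
    n = len(hist)
    return [hist[i + m] if i + m < n else 0 for i in range(n)]


def toy_estimate(hq_areas):
    """Placeholder estimator: a coefficient in bin i costs i + 3 bits."""
    return sum(count * (i + 3) for hq in hq_areas for i, count in enumerate(hq))


def select_quantization_shift(hist_areas, target_bits, estimate, max_shift):
    """Return (m, size) for the smallest m whose estimated size fits the target."""
    size = None
    for m in range(max_shift + 1):
        hq_areas = [shift_histogram(h, m) for h in hist_areas]
        size = estimate(hq_areas)
        if size <= target_bits:
            return m, size
    return max_shift, size  # nothing fits: fall back to the coarsest level


hist_areas = [[1, 1, 2, 0, 0, 0, 0, 1, 0], [3, 2, 0, 0, 0, 0, 0, 0, 0]]
print(select_quantization_shift(hist_areas, 30, toy_estimate, max_shift=8))
```

Each candidate level costs only a histogram shift and a few multiply-adds, so several levels can be tried without running the variable-length encoder.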

3. Experimental results 

Figure 4 shows a result of the preferred embodiment code size
estimation method applied to the DV image encoding process using the 8x8
block partitioning shown in Figure 2 and with the normalized code size
approximations shown in Figure 3. Most of the estimates (code sizes estimated
for 1350 macroblocks) have an accuracy ranging from 60% to 100%. On average,
the accuracy of this estimation method is 87%.

4. Modifications 

The preferred embodiment code size estimation methods can be varied in 
various ways while preserving the feature of estimating the code size from a 
histogram of coefficient magnitudes together with the normalized code size 
functions from the Huffman (variable length code) table. 

For example, the DCT coefficient block size could be increased (e.g.,
16x16) or decreased, the number of areas in a block could be varied from 1 to
any convenient number, the normalization size $A$ for the normalized code size
functions $T'(x,i)$ could be varied, and so forth. The same code size estimation
as a function of quantization level could be applied to coefficients of a wavelet
transform in place of the DCT.